
feat: Add database backup and restore utilities #26607

Open
pmbrull wants to merge 12 commits into main from
task/add-database-backup-and-restore-utilitie-3e452359

Conversation


@pmbrull pmbrull commented Mar 19, 2026

Summary

  • Adds DatabaseBackupRestore utility class for full database backup and restore via JDBI
  • Adds backup and restore CLI commands to OpenMetadataOperations
  • Adds MigrationTestRunner with MigrationTestCase interface for migration validation
  • Adds test-migration CLI command that restores a backup, runs pending migrations one by one, and executes before/after test assertions
  • Supports both MySQL and PostgreSQL using information_schema for dynamic table/column discovery
  • Automatically skips GENERATED columns during backup (entity tables store derived columns from JSON)
  • Backup creates a .tar.gz archive with metadata.json (timestamp, version, db type, per-table info) and one JSON file per table
  • Restore validates database type match, supports --force flag to truncate existing data, and handles FK constraints
  • Migration test summary prints to stdout for easy PR attachment
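
Based on the bullets above, a backup archive would be laid out roughly as follows. The table file name and the rowCount field are illustrative assumptions; timestamp, version, databaseType, tables, columns, and binaryColumns are the fields named elsewhere in this PR.

```text
backup.tar.gz
├── metadata.json              # timestamp, version, db type, per-table info
└── tables/
    ├── table_entity.json      # one JSON stream of row objects per table
    └── ...

metadata.json (sketch):
{
  "timestamp": "2026-03-19T14:30:00Z",
  "version": "1.11.0",
  "databaseType": "MYSQL",
  "tables": {
    "table_entity": {
      "columns": ["id", "json", "..."],
      "binaryColumns": ["..."],
      "rowCount": 123
    }
  }
}
```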

Usage

# Backup
./openmetadata-ops.sh backup -c conf/openmetadata.yaml --backup-path /path/to/backup.tar.gz

# Restore (fails if tables have data)
./openmetadata-ops.sh restore -c conf/openmetadata.yaml --backup-path /path/to/backup.tar.gz

# Restore with force (truncates existing tables first)
./openmetadata-ops.sh restore -c conf/openmetadata.yaml --backup-path /path/to/backup.tar.gz --force

# Test migrations against a backup
./openmetadata-ops.sh test-migration -c conf/openmetadata.yaml --backup-path /path/to/backup.tar.gz

Migration Test Framework

Define test classes per migration version following the naming convention:

package org.openmetadata.service.migration.tests.v1120; // for version 1.12.0

public class MigrationTest implements MigrationTestCase {
  @Override
  public List<TestResult> validateBefore(Handle handle) {
    // assertions before the migration runs
    return List.of();
  }

  @Override
  public List<TestResult> validateAfter(Handle handle) {
    // assertions after the migration runs
    return List.of();
  }
}
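
TestResult is described elsewhere in this PR as "a small result record with pass/fail helpers". A minimal sketch of what that record might look like — the exact field and helper names are assumptions, not the PR's actual class:

```java
import java.util.List;

// Hypothetical minimal shape of TestResult; the real class in the PR may differ.
public record TestResult(String testName, boolean passed, String message) {

  // Helper for a passing assertion.
  public static TestResult pass(String testName) {
    return new TestResult(testName, true, null);
  }

  // Helper for a failing assertion, carrying a diagnostic message for the summary
  // table, e.g. fail("config migrated", "Expected 5 rows, got 3").
  public static TestResult fail(String testName, String message) {
    return new TestResult(testName, false, message);
  }

  // Convenience for returning a single result from validateBefore/validateAfter.
  public static List<TestResult> single(TestResult result) {
    return List.of(result);
  }
}
```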

The test-migration command outputs a summary table:

============================================================
              Migration Test Summary
============================================================
 Source version  : 1.11.0
 Target version  : 1.12.0
 Database type   : MYSQL
 Backup timestamp: 2026-03-19T14:30:00Z
------------------------------------------------------------
 Migration | Test                         | Phase  | Result
------------------------------------------------------------
 1.11.1    | (no tests)                   |   -    |   -
 1.11.2    | column X exists              | BEFORE |  PASS
 1.12.0    | config migrated              | AFTER  |  FAIL
           |   Expected 5 rows, got 3     |        |
------------------------------------------------------------
 Total: 1 passed, 1 failed
============================================================

Test plan

  • Unit tests for extractDatabaseName (4 tests)
  • Unit tests for versionToPackage (5 tests)
  • Manual test: backup against a local MySQL database, verify archive structure
  • Manual test: restore to an empty database, verify data integrity
  • Manual test: restore without --force on non-empty database fails with table listing
  • Manual test: restore with --force truncates and restores successfully
  • Manual test: test-migration with a sample MigrationTest class

pmbrull and others added 3 commits March 19, 2026 18:35
Introduces a new utility class that provides database backup and restore
functionality for OpenMetadata's CLI. It discovers all tables via
information_schema, filters out GENERATED columns, dumps everything to
a .tar.gz archive with metadata, and restores from such archives.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings March 19, 2026 17:42
@github-actions github-actions bot added the Ingestion and safe to test labels Mar 19, 2026
Copilot AI left a comment

Pull request overview

Adds an ops-level database backup/restore capability to OpenMetadata, wiring new backup / restore CLI commands into OpenMetadataOperations and implementing archive-based export/import via JDBI for MySQL/PostgreSQL.

Changes:

  • Introduces DatabaseBackupRestore utility to write/read a .tar.gz backup with per-table JSON plus metadata.json.
  • Adds backup and restore Picocli commands to OpenMetadataOperations.
  • Adds unit tests for extractDatabaseName JDBC URL parsing.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 9 comments.

File Description
openmetadata-service/src/main/java/org/openmetadata/service/util/DatabaseBackupRestore.java Implements table/column discovery, backup archive creation, and restore logic.
openmetadata-service/src/main/java/org/openmetadata/service/util/OpenMetadataOperations.java Wires backup / restore commands into the ops CLI.
openmetadata-service/src/test/java/org/openmetadata/service/util/DatabaseBackupRestoreTest.java Adds unit tests for JDBC database-name extraction helper.
openmetadata-service/pom.xml Adds commons-compress dependency for tar/gzip support.


Comment on lines +242 to +250
        });

    try {
      jdbi.useHandle(this::disableForeignKeyChecks);
      restoreTablesFromArchive(backupPath, tablesMetadata);
      LOG.info("Restore completed successfully");
    } finally {
      jdbi.useHandle(this::enableForeignKeyChecks);
    }
Comment on lines +294 to +296
byte[] content = tais.readAllBytes();
ArrayNode rows = (ArrayNode) MAPPER.readTree(content);

Comment on lines +153 to +160
    int offset = 0;
    ArrayNode allRows = MAPPER.createArrayNode();

    while (true) {
      String sql =
          String.format(
              "SELECT %s FROM %s LIMIT %d OFFSET %d",
              quotedColumns, quotedTable, BATCH_SIZE, offset);
Comment on lines +360 to +368
          JsonNode val = row.get(col);
          if (val == null || val.isNull()) {
            batch.bindNull(col, Types.VARCHAR);
          } else if (val.isNumber()) {
            if (val.isLong() || val.isInt() || val.isBigInteger()) {
              batch.bind(col, val.longValue());
            } else {
              batch.bind(col, val.doubleValue());
            }
Comment on lines +159 to +160
"SELECT %s FROM %s LIMIT %d OFFSET %d",
quotedColumns, quotedTable, BATCH_SIZE, offset);
Comment on lines +215 to +251
  public void restore(String backupPath, boolean force) throws IOException {
    LOG.info("Starting database restore from {}", backupPath);

    ObjectNode metadata = readMetadata(backupPath);
    String backupDbType = metadata.get("databaseType").asText();
    if (!backupDbType.equals(connectionType.name())) {
      throw new IllegalStateException(
          String.format(
              "Backup database type '%s' does not match current connection type '%s'",
              backupDbType, connectionType.name()));
    }

    LOG.info(
        "Backup info - version: {}, timestamp: {}, databaseType: {}",
        metadata.get("version").asText(),
        metadata.get("timestamp").asText(),
        backupDbType);

    ObjectNode tablesMetadata = (ObjectNode) metadata.get("tables");

    jdbi.useHandle(
        handle -> {
          if (force) {
            truncateAllTables(handle, tablesMetadata);
          } else {
            validateTablesEmpty(handle, tablesMetadata);
          }
        });

    try {
      jdbi.useHandle(this::disableForeignKeyChecks);
      restoreTablesFromArchive(backupPath, tablesMetadata);
      LOG.info("Restore completed successfully");
    } finally {
      jdbi.useHandle(this::enableForeignKeyChecks);
    }
  }
"SELECT column_name FROM information_schema.columns "
+ "WHERE table_schema = 'public' AND table_name = :table "
+ "AND (is_generated = 'NEVER' OR is_generated IS NULL) "
+ "AND (column_default NOT LIKE 'nextval%' OR column_default IS NULL) "
batch.bind(col, val.doubleValue());
}
} else if (val.isBoolean()) {
batch.bind(col, val.booleanValue());
Comment on lines +381 to +387
  private void disableForeignKeyChecks(Handle handle) {
    if (connectionType == ConnectionType.MYSQL) {
      handle.execute("SET FOREIGN_KEY_CHECKS = 0");
    } else {
      handle.execute("SET session_replication_role = 'replica'");
    }
  }
pmbrull and others added 2 commits March 19, 2026 19:11
…mary output

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds a test-migration command to OpenMetadataOperations that:
- Restores a backup, then runs pending migrations one by one
- For each migration with a test class, runs validateBefore/validateAfter
- Prints a summary table to stdout for PR validation
- Returns exit code 0 (all pass) or 1 (any fail)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

github-actions bot commented Mar 19, 2026

🟡 Playwright Results — all passed (17 flaky)

✅ 3387 passed · ❌ 0 failed · 🟡 17 flaky · ⏭️ 183 skipped

Shard      | Passed | Failed | Flaky | Skipped
🟡 Shard 1 |    452 |      0 |     3 |       2
✅ Shard 2 |    305 |      0 |     0 |       1
🟡 Shard 3 |    667 |      0 |     6 |      33
🟡 Shard 4 |    683 |      0 |     2 |      41
🟡 Shard 5 |    670 |      0 |     2 |      73
🟡 Shard 6 |    610 |      0 |     4 |      33
🟡 17 flaky test(s) (passed on retry)
  • Flow/Tour.spec.ts › Tour should work from welcome screen (shard 1, 1 retry)
  • Flow/Tour.spec.ts › Tour should work from URL directly (shard 1, 1 retry)
  • Pages/AuditLogs.spec.ts › should apply both User and EntityType filters simultaneously (shard 1, 1 retry)
  • Features/DataQuality/TestCaseImportExportBasic.spec.ts › Data Steward can successfully export test cases (shard 3, 1 retry)
  • Features/DataQuality/TestCaseIncidentPermissions.spec.ts › User with only VIEW cannot PATCH incidents (shard 3, 1 retry)
  • Features/DataQuality/TestCaseResultPermissions.spec.ts › User with only VIEW cannot PATCH results (shard 3, 1 retry)
  • Features/ImpactAnalysis.spec.ts › Verify Upstream connections (shard 3, 1 retry)
  • Features/Permissions/GlossaryPermissions.spec.ts › Team-based permissions work correctly (shard 3, 1 retry)
  • Features/RestoreEntityInheritedFields.spec.ts › Validate restore with Inherited domain and data products assigned (shard 3, 1 retry)
  • Pages/Customproperties-part2.spec.ts › entityReferenceList shows item count, scrollable list, no expand toggle (shard 4, 1 retry)
  • Pages/Domains.spec.ts › Rename domain (shard 4, 1 retry)
  • Pages/Entity.spec.ts › Set & Update table-cp, hyperlink-cp, string, integer, markdown, number, duration, email, enum, sqlQuery, timestamp, entityReference, entityReferenceList, timeInterval, time-cp, date-cp, dateTime-cp Custom Property (shard 5, 1 retry)
  • Pages/ExplorePageRightPanel.spec.ts › Should allow Data Consumer to edit tags for table (shard 5, 1 retry)
  • Pages/UserDetails.spec.ts › Admin user can get all the roles hierarchy and edit roles (shard 6, 2 retries)
  • Pages/Users.spec.ts › Permissions for table details page for Data Consumer (shard 6, 1 retry)
  • VersionPages/EntityVersionPages.spec.ts › Table (shard 6, 1 retry)
  • VersionPages/EntityVersionPages.spec.ts › Topic (shard 6, 1 retry)


How to debug locally
# Download playwright-test-results-<shard> artifact and unzip
npx playwright show-trace path/to/trace.zip    # view trace

pmbrull and others added 2 commits March 19, 2026 21:02
- Use single Handle for entire restore (FK disable/enable + inserts on same session)
- Add binary data handling (val.isBinary() check) in restore to decode Base64
- Add ORDER BY via primary key discovery to prevent non-deterministic pagination
- Escape embedded quotes in quoteIdentifier to prevent SQL injection
- Validate archive table names against metadata to reject unknown tables
- Stream backup to temp file per table to avoid OOM on large tables
- Stream restore via JsonParser to avoid loading entire table into memory
- Use REPEATABLE READ transaction isolation for consistent backup snapshots
- Switch to positional parameters (?) to handle columns with special chars
- Replace hardcoded 'public' schema with current_schema() for PostgreSQL
- Use bind(idx, (Object) null) instead of bindNull with Types.VARCHAR
- Fix unused pattern variable (use 'number' binding in instanceof chain)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The val.isBinary() check in insertRowsStreaming was dead code because
Jackson's writeBinaryField encodes byte[] as base64 strings in JSON,
and readTree deserializes them as TextNode (not BinaryNode). This caused
binary data to be inserted as raw base64 strings instead of decoded bytes.

Store binary column names (discovered via information_schema.columns) in
the backup metadata so restore can decode base64 back to byte[] for the
correct columns.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
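
The fix described above hinges on Base64 round-tripping. A stdlib-only illustration of the behavior — not the PR's actual code:

```java
import java.util.Base64;

public final class BinaryRoundTrip {
  // Jackson's writeBinaryField stores byte[] as a Base64 string in JSON;
  // reading it back with readTree yields a TextNode, not a BinaryNode.
  // Restore must therefore explicitly decode Base64 for columns that the
  // backup metadata records as binary.
  public static String encode(byte[] raw) {
    return Base64.getEncoder().encodeToString(raw);
  }

  public static byte[] decode(String base64Text) {
    return Base64.getDecoder().decode(base64Text);
  }
}
```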
pmbrull and others added 5 commits March 19, 2026 21:13
…tion, extract commons-compress version

- MigrationTestRunner: replace all System.out.println/printf with LOG.info
  in printSummary and centerText for proper logging
- MigrationTestRunner: add LOG.warn for partial migration state on failure
- OpenMetadataOperations: add --force flag to test-migration command to
  prevent accidental table truncation (mirrors restore command pattern)
- pom.xml: extract hardcoded commons-compress 1.27.1 to parent POM property

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…adata, and versionToPackage edge cases

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…flow

- Handle BigDecimal in backup to prevent truncation of DECIMAL/numeric values
- Add MAX_METADATA_SIZE limit and use readNBytes for defensive metadata parsing
- Move discoverBinaryColumns before data write for logical grouping
- Eliminate double FK disable/enable by restructuring force-restore flow
- Add DatasourceConfig.initialize to backup/restore CLI commands

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add --batch-size option to backup, restore, and test-migration commands.
Defaults to 1000 rows. Applies to both SQL pagination during backup and
insert batching during restore.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…d AFTER tests on failure

- Add SAFE_IDENTIFIER regex to quoteIdentifier() for SQL injection defense-in-depth
- Use underscore separators in versionToPackage to prevent version collisions (e.g. 1.12.0 vs 1.1.20)
- Skip validateAfter when migration has failed to avoid confusing results

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings March 20, 2026 08:23

gitar-bot bot commented Mar 20, 2026

Code Review ✅ Approved 5 resolved / 5 findings

Adds comprehensive database backup and restore utilities with CLI commands for migration validation and testing. Resolves critical issues including FK check isolation, SQL injection via backup metadata, snapshot isolation during reads, version collision handling, and test execution safety.

✅ 5 resolved
Bug: FK checks disabled on separate handle, not effective for inserts

📄 openmetadata-service/src/main/java/org/openmetadata/service/util/DatabaseBackupRestore.java:244-250 📄 openmetadata-service/src/main/java/org/openmetadata/service/util/DatabaseBackupRestore.java:303
At line 245, jdbi.useHandle(this::disableForeignKeyChecks) disables FK checks on a handle that is immediately closed. Then restoreTablesFromArchive (line 246) calls jdbi.useHandle(handle -> insertRows(...)) at line 303, which obtains a different connection from the pool. SET FOREIGN_KEY_CHECKS = 0 (MySQL) and SET session_replication_role = 'replica' (Postgres) are session-level settings — they only apply to the connection they were executed on. Since each jdbi.useHandle() may obtain a different pooled connection, the inserts will run with FK checks still enabled, causing failures when tables are restored in an order that violates foreign key constraints.

The same issue affects the enableForeignKeyChecks call in the finally block (line 249) — it may re-enable checks on yet another connection that never had them disabled.

Security: SQL injection via crafted backup archive table/column names

📄 openmetadata-service/src/main/java/org/openmetadata/service/util/DatabaseBackupRestore.java:397-402 📄 openmetadata-service/src/main/java/org/openmetadata/service/util/DatabaseBackupRestore.java:284 📄 openmetadata-service/src/main/java/org/openmetadata/service/util/DatabaseBackupRestore.java:292
During restore, table names are extracted from tar entry paths (line 284) and metadata.json field names (line 233) — both sourced from the untrusted backup archive. These names flow into SQL statements via quoteIdentifier() (line 397-402), which wraps identifiers in backticks or double-quotes but does NOT escape those delimiter characters within the identifier itself. A crafted archive with a table name such as foo` DROP TABLE users; -- could inject arbitrary SQL, because the embedded backtick closes the quoted identifier early.

quoteIdentifier is used in SELECT (line 314), TRUNCATE (line 336), and INSERT (line 350) statements. Column names from metadata.json are similarly unvalidated (line 292, used in INSERT at line 347).

While the attack requires someone to provide a malicious .tar.gz file to the restore command, this is a realistic threat for a backup/restore utility — especially if backups are downloaded from remote storage.
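
A defensive quoteIdentifier doubles any embedded delimiter so a crafted name cannot close the quoted identifier early. A sketch under that assumption (method names are illustrative, not the PR's actual API):

```java
public final class Identifiers {
  // Quote a MySQL identifier; doubling embedded backticks keeps a crafted
  // name like foo`bar inside the quotes instead of terminating them.
  public static String quoteMySql(String name) {
    return "`" + name.replace("`", "``") + "`";
  }

  // Quote a PostgreSQL identifier; embedded double quotes are doubled.
  public static String quotePostgres(String name) {
    return "\"" + name.replace("\"", "\"\"") + "\"";
  }
}
```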

Edge Case: Backup reads tables without snapshot isolation

📄 openmetadata-service/src/main/java/org/openmetadata/service/util/DatabaseBackupRestore.java:119-127
The backup() method reads tables sequentially within a single jdbi.useHandle callback, but without setting a transaction isolation level (e.g., REPEATABLE READ or SERIALIZABLE). If the database is being written to during backup, the resulting archive may contain an inconsistent snapshot — e.g., a foreign key reference in table A pointing to a row in table B that wasn't captured because it was deleted between the two table reads.

Bug: versionToPackage produces collisions for different versions

📄 openmetadata-service/src/main/java/org/openmetadata/service/util/MigrationTestRunner.java:167
The versionToPackage method simply concatenates the numeric parts of a version string without any separator or fixed-width formatting. This means different versions can map to the same package name, causing the wrong test class to be loaded (or no test to be found).

Examples of collisions:

  • "1.12.0""v1120" and "1.1.20""v1120"
  • "1.1.0""v110" and "1.10""v110"

If a future migration version happens to collide with an existing one, the test framework will silently run the wrong test class or fail to find the correct one.
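
A later commit in this PR adopts underscore separators to remove this ambiguity. A sketch of such an injective mapping (class name assumed):

```java
public final class VersionPackages {
  // Map a migration version to a test package suffix. Keeping a separator
  // between the numeric parts makes the mapping injective: "1.12.0" and
  // "1.1.20" no longer collapse to the same "v1120".
  public static String versionToPackage(String version) {
    return "v" + version.replace(".", "_");
  }
}
```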

Edge Case: AFTER tests still run when migration fails

📄 openmetadata-service/src/main/java/org/openmetadata/service/util/MigrationTestRunner.java:113
When a migration fails (lines 106-111), the migrationFailed flag is set but the code only checks testCase != null before running validateAfter (line 113). The else if (!migrationFailed) branch at line 115 only applies to the "no tests" placeholder — if a testCase exists, its validateAfter will still execute against a partially-migrated database before the break at line 119. This could produce confusing test results or exceptions from the after-validation running against an inconsistent schema.


Copilot AI left a comment

Pull request overview

This PR adds operational utilities to OpenMetadata’s Java service to support full database backup/restore into a .tar.gz archive and introduces a migration testing runner that restores a backup and executes pending migrations with optional per-version before/after assertions.

Changes:

  • Add DatabaseBackupRestore utility (JDBI-based) and wire new backup, restore, and test-migration CLI commands into OpenMetadataOperations.
  • Add MigrationTestRunner plus a small migration test API (MigrationTestCase, TestResult) and expose migrations list via a getter on MigrationWorkflow.
  • Add unit tests covering JDBC DB-name extraction, identifier quoting, metadata reading, and migration version → package mapping.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 10 comments.

Show a summary per file
File Description
pom.xml Adds commons-compress version property used for archive support.
openmetadata-service/pom.xml Adds Apache Commons Compress dependency for tar/gzip backup archives.
openmetadata-service/src/main/java/org/openmetadata/service/util/OpenMetadataOperations.java Adds backup, restore, test-migration CLI commands.
openmetadata-service/src/main/java/org/openmetadata/service/util/DatabaseBackupRestore.java Implements dynamic table/column discovery, tar.gz backup, and restore logic for MySQL/Postgres.
openmetadata-service/src/main/java/org/openmetadata/service/util/MigrationTestRunner.java Implements restore+stepwise migration execution with before/after validation and summary output.
openmetadata-service/src/main/java/org/openmetadata/service/migration/api/MigrationWorkflow.java Exposes migrations via Lombok @Getter to support the runner.
openmetadata-service/src/main/java/org/openmetadata/service/migration/api/MigrationTestCase.java Defines the before/after validation interface for migration tests.
openmetadata-service/src/main/java/org/openmetadata/service/migration/api/TestResult.java Adds a small result record with pass/fail helpers for assertions.
openmetadata-service/src/test/java/org/openmetadata/service/util/DatabaseBackupRestoreTest.java Unit tests for DB-name extraction, identifier quoting, and metadata.json reading.
openmetadata-service/src/test/java/org/openmetadata/service/util/MigrationTestRunnerTest.java Unit tests for versionToPackage parsing behavior.

"SELECT column_name FROM information_schema.columns "
+ "WHERE table_schema = current_schema() AND table_name = :table "
+ "AND (is_generated = 'NEVER' OR is_generated IS NULL) "
+ "AND (column_default NOT LIKE 'nextval%' OR column_default IS NULL) "
Copilot AI Mar 20, 2026

In Postgres, discoverColumns filters out columns whose column_default starts with nextval (sequence-backed/serial columns). That means those columns (e.g., openmetadata_settings.id) are omitted from the backup metadata and will not be restored, which breaks “full backup/restore” semantics and can also break FK relationships if any table references those IDs. Consider including these columns in backups and, after restore, resetting sequences to max(col) (or setval) so future inserts keep working.

Suggested change (remove the nextval filter so sequence-backed columns are included in the backup):
-            + "AND (column_default NOT LIKE 'nextval%' OR column_default IS NULL) "

Comment on lines +556 to +561
  private void disableForeignKeyChecks(Handle handle) {
    if (connectionType == ConnectionType.MYSQL) {
      handle.execute("SET FOREIGN_KEY_CHECKS = 0");
    } else {
      handle.execute("SET session_replication_role = 'replica'");
    }
Copilot AI Mar 20, 2026

For Postgres, SET session_replication_role = 'replica' generally requires superuser privileges. If OpenMetadata runs with a non-superuser DB user (common in managed Postgres), restore will fail before any data is loaded. Consider an alternative approach that works without superuser (e.g., restoring in FK-safe order, using deferrable constraints/SET CONSTRAINTS ALL DEFERRED where applicable, or documenting/enforcing the required privileges explicitly).
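
The non-superuser alternative mentioned here can be sketched as follows; it only helps for constraints that were declared DEFERRABLE:

```sql
BEGIN;
-- Defers checking of all DEFERRABLE constraints until COMMIT, so tables
-- can be loaded in any order without superuser privileges.
SET CONSTRAINTS ALL DEFERRED;
-- ... INSERT restored rows here ...
COMMIT;
```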

Comment on lines +202 to +203
} finally {
commitTransaction(handle);
Copilot AI Mar 20, 2026

The backup path wraps table reads in a transaction but always calls COMMIT in a finally block. If any SQL error occurs (especially on Postgres where the transaction becomes aborted), COMMIT will fail and may mask the original error; it also prevents a clean rollback. Consider committing only on success and issuing a ROLLBACK when an exception occurs (or use JDBI transaction helpers like useTransaction).

Suggested change
-    } finally {
-      commitTransaction(handle);
+      commitTransaction(handle);
+    } catch (Exception e) {
+      handle.rollback();
+      throw e;

Comment on lines +2699 to +2701
description =
"Number of rows per batch during restore. Default: "
+ DatabaseBackupRestore.DEFAULT_BATCH_SIZE)
Copilot AI Mar 20, 2026

The --batch-size option hardcodes defaultValue = "1000" but the description references DatabaseBackupRestore.DEFAULT_BATCH_SIZE. If the constant ever changes, CLI behavior and help text will diverge. Consider defining a single source of truth (e.g., keep the help text literal, or introduce a shared constant for both).

Suggested change
-      description =
-          "Number of rows per batch during restore. Default: "
-              + DatabaseBackupRestore.DEFAULT_BATCH_SIZE)
+      description = "Number of rows per batch during restore. Default: 1000")

Comment on lines +425 to +457
      TarArchiveEntry entry;
      while ((entry = tais.getNextEntry()) != null) {
        String name = entry.getName();
        if (!name.startsWith("tables/") || !name.endsWith(".json")) {
          continue;
        }

        String tableName = name.substring("tables/".length(), name.length() - ".json".length());

        if (!validTables.contains(tableName)) {
          LOG.warn("Table {} from archive not in metadata, skipping", tableName);
          continue;
        }

        JsonNode tableMetaNode = tablesMetadata.get(tableName);
        if (tableMetaNode == null) {
          LOG.warn("No metadata found for table {}, skipping", tableName);
          continue;
        }

        List<String> columns = new ArrayList<>();
        tableMetaNode.get("columns").forEach(col -> columns.add(col.asText()));

        Set<String> binaryColumns = new HashSet<>();
        JsonNode binaryColumnsNode = tableMetaNode.get("binaryColumns");
        if (binaryColumnsNode != null) {
          binaryColumnsNode.forEach(col -> binaryColumns.add(col.asText()));
        }

        LOG.info("Restoring table {}", tableName);
        int rowCount = insertRowsStreaming(handle, tableName, columns, binaryColumns, tais);
        LOG.info("Restored table {} ({} rows)", tableName, rowCount);
      }
Copilot AI Mar 20, 2026

restoreTablesFromArchive restores only tables that have a corresponding tables/<name>.json entry in the archive, but it does not validate that every table listed in metadata.json was actually present/restored. A truncated/corrupted backup could therefore “restore successfully” with missing tables/data. Consider tracking restored table names and failing if any tables from tablesMetadata are missing in the archive.
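
The completeness check suggested here reduces to a set difference. A stdlib-only sketch (class and method names are illustrative):

```java
import java.util.HashSet;
import java.util.Set;

public final class RestoreCompleteness {
  // Tables listed in metadata.json but never seen in the archive indicate a
  // truncated or corrupted backup; returning them lets restore fail loudly
  // instead of "succeeding" with missing data.
  public static Set<String> missingTables(Set<String> fromMetadata, Set<String> restored) {
    Set<String> missing = new HashSet<>(fromMetadata);
    missing.removeAll(restored);
    return missing;
  }
}
```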

Comment on lines +484 to +490
.fieldNames()
.forEachRemaining(
tableName -> {
String sql = String.format("TRUNCATE TABLE %s", quoteIdentifier(tableName));
handle.execute(sql);
LOG.info("Truncated table {}", tableName);
});
Copilot AI Mar 20, 2026

truncateAllTables runs TRUNCATE TABLE <table> per table. On Postgres this can fail when there are FK dependencies between tables (TRUNCATE enforces FK reference rules unless you use CASCADE or truncate all related tables in one statement). Consider using TRUNCATE ... CASCADE for Postgres (or truncating all tables in a single TRUNCATE statement) to make --force reliable.
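
A single multi-table TRUNCATE makes --force reliable on Postgres; table names here are illustrative, not the actual schema:

```sql
-- One statement truncates the whole set atomically; CASCADE also clears
-- dependent tables that are not listed, so use with care.
TRUNCATE TABLE table_entity, tag_usage, entity_relationship CASCADE;
```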

Comment on lines +292 to +335
      int offset = 0;
      while (true) {
        String sql =
            String.format(
                "SELECT %s FROM %s%s LIMIT %d OFFSET %d",
                quotedColumns, quotedTable, orderByClause, batchSize, offset);
        List<Map<String, Object>> rows = handle.createQuery(sql).mapToMap().list();

        for (Map<String, Object> row : rows) {
          gen.writeStartObject();
          for (String col : columns) {
            Object val = row.get(col);
            if (val == null) {
              gen.writeNullField(col);
            } else if (val instanceof Number number) {
              if (number instanceof Long l) {
                gen.writeNumberField(col, l);
              } else if (number instanceof Integer i) {
                gen.writeNumberField(col, i);
              } else if (number instanceof Double d) {
                gen.writeNumberField(col, d);
              } else if (number instanceof Float f) {
                gen.writeNumberField(col, f);
              } else if (number instanceof BigDecimal bd) {
                gen.writeNumberField(col, bd);
              } else {
                gen.writeNumberField(col, number.longValue());
              }
            } else if (val instanceof Boolean b) {
              gen.writeBooleanField(col, b);
            } else if (val instanceof byte[] bytes) {
              gen.writeBinaryField(col, bytes);
            } else {
              gen.writeStringField(col, val.toString());
            }
          }
          gen.writeEndObject();
          rowCount++;
        }

        if (rows.size() < batchSize) {
          break;
        }
        offset += batchSize;
Copilot AI Mar 20, 2026

writeTableToTempFile paginates with LIMIT ... OFFSET ... while increasing offset by batchSize. For large tables this becomes progressively slower (and can be very slow on Postgres/MySQL) because the database must scan/skip offset rows each time. Consider keyset pagination using the primary key(s) (or a stable cursor) to page efficiently, especially since you already discover PK columns.

Suggested change (the removed lines are the LIMIT/OFFSET loop quoted above; the replacement streams a single query with a JDBC fetch size instead):
+      String sql = String.format("SELECT %s FROM %s%s", quotedColumns, quotedTable, orderByClause);
+      Iterable<Map<String, Object>> rows =
+          handle.createQuery(sql).setFetchSize(batchSize).mapToMap();
+      for (Map<String, Object> row : rows) {
+        gen.writeStartObject();
+        for (String col : columns) {
+          // ... same per-value serialization as in the original loop ...
+        }
+        gen.writeEndObject();
+        rowCount++;
+      }

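
The keyset-pagination alternative referenced in this comment can be sketched as follows, assuming an illustrative `id` primary key and `json` column:

```sql
-- First page
SELECT id, json FROM table_entity ORDER BY id LIMIT 1000;
-- Subsequent pages: seek past the last id seen instead of scanning OFFSET rows
SELECT id, json FROM table_entity WHERE id > :last_id ORDER BY id LIMIT 1000;
```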
Comment on lines +2623 to +2625
description =
"Number of rows to read/write per batch. Default: "
+ DatabaseBackupRestore.DEFAULT_BATCH_SIZE)
Copilot AI Mar 20, 2026

The --batch-size option hardcodes defaultValue = "1000" but the description references DatabaseBackupRestore.DEFAULT_BATCH_SIZE. If the constant ever changes, CLI behavior and help text will diverge. Consider defining a single source of truth (e.g., keep the help text literal, or introduce a shared constant for both).

Suggested change
-      description =
-          "Number of rows to read/write per batch. Default: "
-              + DatabaseBackupRestore.DEFAULT_BATCH_SIZE)
+      description = "Number of rows to read/write per batch. Default: 1000")

Comment on lines +2657 to +2662
names = {"--batch-size"},
defaultValue = "1000",
description =
"Number of rows to insert per batch. Default: "
+ DatabaseBackupRestore.DEFAULT_BATCH_SIZE)
int batchSize) {
Copilot AI Mar 20, 2026

The --batch-size option hardcodes defaultValue = "1000" but the description references DatabaseBackupRestore.DEFAULT_BATCH_SIZE. If the constant ever changes, CLI behavior and help text will diverge. Consider defining a single source of truth (e.g., keep the help text literal, or introduce a shared constant for both).

Comment on lines +169 to +172
return null;
} catch (Exception e) {
LOG.warn("Failed to instantiate test class {}", className, e);
return null;
Copilot AI Mar 20, 2026

If a migration test class exists but fails to instantiate (constructor throws, linkage error, etc.), loadTestCase logs a warning and returns null, which later gets reported as “(no tests)” instead of a failing test. Consider surfacing this as a failed entry in the summary so broken test implementations are not silently ignored.

Suggested change
-      return null;
-    } catch (Exception e) {
-      LOG.warn("Failed to instantiate test class {}", className, e);
-      return null;
+      // No migration test defined for this version; treat as "no tests"
+      return null;
+    } catch (Exception e) {
+      String message = String.format("Failed to instantiate migration test class %s", className);
+      LOG.warn(message, e);
+      throw new RuntimeException(message, e);

